Abstract

In an all-pay auction, only one bidder wins but all bidders must pay the auctioneer. All-pay 
bidding games arise from attaching a similar bidding structure to traditional combinatorial games 
to determine which player moves next. In contrast to the established theory of single-pay bidding 
games, optimal play involves choosing bids from some probability distribution that will guarantee 
a minimum probability of winning. In this manner, all-pay bidding games wed the underlying 
concepts of economic and combinatorial games. We present several results on the structures of 
optimal strategies in these games. We then give a fast algorithm for computing such strategies for 
a large class of all-pay bidding games. The methods presented provide a framework for further 
development of the theory of all-pay bidding games. 


1 Introduction 

At the conclusion of an all-pay auction, all bidders must pay the bids they submitted, with only the 
highest bidder receiving the item. With this idea in mind, one can play a variant of a two-player game 
using an all-pay auction to decide who moves next instead of simply alternating between players. For 
example, one could play all-pay Tic-Tac-Toe with 100 chips. Each round both players privately record 
their bids and then simultaneously reveal them. If player A bids 40 and his opponent bids 25, player A 
would get to choose a square to mark and the next round of bidding would begin with player A having 
85 chips, player B having 115 chips. Note that the chips have no value outside the game and only serve 
to determine who moves - the ultimate goal is still just to get three-in-a-row. 
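
The chip bookkeeping in this example can be sketched in a few lines. This is only an illustration, assuming (as the 85/115 split above implies) that each player's bid is paid to the opponent; the helper name `settle_all_pay_round` is hypothetical and introduced here for the example.

```python
def settle_all_pay_round(chips_a, chips_b, bid_a, bid_b):
    """Return (chips_a, chips_b) after both players pay their bids to each other."""
    if not (0 <= bid_a <= chips_a and 0 <= bid_b <= chips_b):
        raise ValueError("each bid must lie between 0 and the bidder's chip count")
    # Both bids are paid regardless of who wins the move (the all-pay rule).
    new_a = chips_a - bid_a + bid_b
    new_b = chips_b - bid_b + bid_a
    return new_a, new_b


# Player A bids 40 and player B bids 25, each starting from 100 chips.
print(settle_all_pay_round(100, 100, 40, 25))
```

The total number of chips is conserved, which is why only the difference between the two bids matters for the chip counts in the next round.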

Another variant of the game could have only the player who wins the move pay his/her bid, i.e. 
deciding who moves next using a first-price auction. These games were first studied formally in the 
1980s by Richman, whose work has since then been greatly expanded upon. Intuitively, there is less 
risk in these “Richman games” for the player losing the bid. If your opponent bids 100 for a certain 
move, it makes no difference whether your bid was 99 or 0. All that matters is that your opponent’s 
bid was higher. A surprising consequence of this single-pay structure is that for every state of a game, 
there exists a “Richman value” v for each player that represents the proportion of the total chips that 
player would need to hold to have a deterministic winning strategy. In this situation, the player with the 
winning strategy can tell her opponent what bid she will be making next without affecting her ability to 
ultimately win. For zero-sum games, this means that unless a player’s chip ratio is exactly v, then one 
of the players must have such a winning strategy. 

Our objective is to begin the formal study of all-pay bidding games. Returning to the above example 
where your opponent bids 100 chips and you are indifferent between bidding 99 and 0, it is clear this 
is no longer true for an all-pay bidding mechanism. You would be very disappointed had you bid 99, 
as your opponent would be paying just 1 chip on net to make a move. Had you bid 0, though, you 
might feel pretty good about not moving this current turn, as the 100 extra chips may make a bigger 
difference for the rest of the game. Thus, there are at least two bidding scenarios which intuitively 
seem like very good positions to be in: winning the bid by a relatively small number of chips or losing 
the bid by a relatively large number of chips. This behavior suggests that, unlike in Richman games, 
in all-pay bidding games one of the players will not necessarily have a deterministic winning strategy. 
Instead, players must randomize their bidding in some way. Thus, we must appeal to the concept of 
mixed bidding strategies in Nash equilibria. 

1.1 A Game of All-Pay Bidding Tic Tac Toe 

Before presenting formal definitions and results, we provide a sample all-pay bidding game to illustrate 
some of the main features of playing these games. Alice and Bob, each with 100 chips, are playing all-pay 
bidding Tic-Tac-Toe. Each turn Alice and Bob secretly write down a bid, a whole number less than or 
equal to their total number of chips. They then reveal their bids and whoever bid more gets to decide 
who makes the next move. We say a player has advantage if, when players bid the same amount, that 
player gets to decide who makes the next move. The question of deciding how to assign advantage is 
one we encountered early on. For our games, we give advantage to the player with more chips, then 
arbitrarily let Alice have advantage when Alice and Bob have the same number of chips. A number of 
other mechanisms would also suffice, such as alternating advantage or having a special “tie-breaking” 
chip that grants advantage and is passed each time it is used. Our choice was made in the interest of 
computational simplicity and to eventually allow extension to real-valued bidding. 
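
The tie-breaking rule chosen above can be written down directly. The sketch below is an illustration only: the function name `bid_winner` and the labels "A" (Alice) and "B" (Bob) are assumptions made for this example.

```python
def bid_winner(chips_a, chips_b, bid_a, bid_b):
    """Return "A" or "B", the player who wins the bid under the advantage rule."""
    if bid_a != bid_b:
        return "A" if bid_a > bid_b else "B"
    # Tied bids: the player holding more chips has advantage,
    # and Alice ("A") has advantage when the chip counts are equal.
    return "A" if chips_a >= chips_b else "B"


# Equal chips and equal bids: Alice has advantage.
print(bid_winner(100, 100, 10, 10))
# Bob holds more chips, so he wins a tied bid.
print(bid_winner(80, 120, 80, 80))
```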

First Move. Both players have 100 chips. Alice bids 25, Bob bids 40. Bob wins the right to move 
and plays in the center of the board. 






. . .
. B .
. . .






Second Move. Alice has 115 chips, Bob has 85 chips. Alice wants to win this move to keep pace with 
Bob, but also does not see why it should be worth more than the first, so she only slightly increases her 
bid to 30. Bob, thinking that Alice may want to win this move more, is content to let Alice win and 
collect chips by bidding 0. Alice wins the right to move and plays in the top-left corner of the board. 


A . .
. B .
. . .






Third Move. Alice has 85 chips, Bob has 115 chips. Alice bids 45, Bob bids 40, so Alice wins the 
right to move and plays in the top-center of the board. 


A A .
. B .
. . .






Fourth Move. Alice has 80 chips, Bob has 120 chips. Alice is one move away from winning and 
decides to risk it and bid all of her 80 chips. Unfortunately for her, Bob has guessed her move and has 
himself bid 80 as well. Because Bob has more chips overall, he uses his advantage to win the tie and 
plays in the top-right corner of the board, blocking Alice’s victory and setting himself up for one. 


A A B
. B .
. . .






Fifth Move. Alice has 80 chips, Bob has 120 chips. Bob has more chips and is just a move away from 
winning, so he can bid everything, play in the bottom-left corner and win the game. 

In normal Tic-Tac-Toe, both players can guarantee a draw by playing well, but as we see from this 
example, the result of a game of all-pay Tic-Tac-Toe involves far more chance. 

For example, at the fourth move in the above game, Alice could have guessed Bob might bid 80 and 
chosen to “duck” by bidding 0. In this case Bob would win the move and play as before, but now the 
chip counts would be 160 to 40 in Alice’s favor, and Alice can bid 40 and then 80 to win the next two 
moves and win in the left column. It is easy to see that if a player knows what his opponent will bid 
at each move, he can win the game easily. Thus, in the vast majority of all-pay bidding games, optimal 
play cannot be deterministic. 

Though we do not return to Tic-Tac-Toe in this paper, it served as a test game for much of our 
research. Using our results, we built a computer program to play all-pay bidding Tic-Tac-Toe optimally. 
The program can be played against at http://biddingttt.herokuapp.com. The theory behind this 
program, which is not specific to Tic-Tac-Toe, will be the focus of the rest of the paper. 
1.2 Overview of Results 


Our ultimate goal is to characterize the optimal strategies for a general class of all-pay bidding games. 
The game consists of iterations of both players bidding for the right to move followed by one of the 
players making a move. In turn, an optimal strategy will also have two parts: the bid strategy and the 
move strategy. For a given position in the game (e.g. a configuration of the Tic-Tac-Toe board) and chip 
counts for each of the players (e.g. Alice has 115 chips, Bob has 85 chips), the bid strategy must tell 
players how to best randomize their bets (e.g. Alice bids 0 chips half the time, 80 chips half the time) 
while the move strategy must tell whoever wins the bid the best move to make (e.g. where to play on 
the Tic-Tac-Toe board). 

The problem of determining move strategy is largely combinatorial in nature and remains similar to 
its analog in Richman games. We can still represent the space of game states as a directed graph, and 
there is not always a single best move that each player can make upon winning the bid. That is, the 
best move could also depend on each player’s chip counts moving forward. 

The focus of our work, then, will be on determining the optimal bidding strategy for any game 
position and chip counts. Naturally, this should depend on a player’s chances of winning in any of the 
possible subsequent game situations (i.e. after a single move and updated chip counts). For purposes 
of initial analysis, we will assume that these future winning probabilities are known, and see how the 
bidding strategy can be determined from this information. Then, by using the recursive nature of the 
directed graph, we will be able to start from the “win” and “loss” nodes (where the probability is just 
1 or 0) to find the optimal bidding strategies and winning probabilities for any game situation. For the 
rest of this paper, we will often refer to a bidding strategy as just a “strategy” when it is clear that the 
focus is just on the bidding side of the game. Here, a strategy will be a probability vector where the 
ith coordinate corresponds to the probability a player will bid i chips. Further, a Nash equilibrium for 
a game situation will just be a pair of strategies so that neither player has an incentive to deviate. This 
means that each player’s strategy will maximize his/her minimum probability of ultimately winning from 
the next turn of the game. 

It quickly becomes apparent that a naive recursive algorithm using linear programming is feasible 
only for games with very few moves. Thus, in the interest of being able to practically calculate the 
optimal bidding strategies for general games, we prove some structural results on the Nash equilibria. In 
particular, useful structure arises when we study a particular class of games that we dubbed “precise”, 
which roughly speaking are games where having one more chip is strictly better than not. The key result 
is a surprising relationship between opposing optimal strategies that allows one to immediately write a 
Nash equilibrium strategy for the player without advantage if given a Nash equilibrium strategy for the 
player with advantage. 

This relationship (Theorem 2.3), which we call the Reverse Theorem, is a critical step toward the calculation 
of optimal strategies for precise games. Further, by assigning an arbitrarily small value in the game to 
each chip, we get a precise game that is very similar to the original game. We show that the optimal 
strategies we can calculate for these new precise games will indeed converge to optimal strategies for our 
possibly imprecise games. Our theoretical results ultimately culminate in a fast algorithm for computing 
optimal probabilistic bidding strategies. Together with a move strategy for the combinatorial side of the 
game, this gives a complete characterization of optimal play for all-pay bidding games. 


2 Strategies in precise games 

Let $G_{a,b}$ denote a single turn of a two-player all-pay bidding game Q where player A is endowed with 
$a$ chips and player B is endowed with $b$ chips. The underlying combinatorial game Q is a two-player 
zero-sum game, represented by an acyclic, colored, directed graph with two marked vertices, $\mathcal{A}$ and $\mathcal{B}$. 
The game begins by placing a token at some starting vertex. At each turn, a player moves the token to an 
adjacent vertex. Player A wins if the token reaches $\mathcal{A}$ and player B wins if the token reaches $\mathcal{B}$. By 
colored, we mean that each edge is one of two colors, say red and blue, such that A can only 
move the token along red edges and B can only move the token along blue edges. To ensure consistency 
in the bidding strategy from turn to turn, we seek to avoid situations where the winner of a bid can be 
put in zugzwang, i.e. where it would be better to not move at all. Thus, the bid-winning player, rather 
than simply being able to move next, gets to determine who moves next. With this condition, Q can be an 
asymmetric game where zugzwang is possible, like chess and many other popular two-player games. 

The payoff, or value of the game, for player A at $G_{a,b}$ is denoted by $v_A(G_{a,b}) \in [0,1]$ and is equal 
to the probability that player A wins the game under optimal play. That is, we set $v_A(\mathcal{A}) = 1$ and 
$v_A(\mathcal{B}) = 0$ at the two marked vertices $\mathcal{A}$ and $\mathcal{B}$ and calculate payoffs recursively. Similarly, let $v_B(G_{a,b})$ denote the probability that player 
B wins the game. Often, when the chip counts or specific combinatorial game are not relevant to the 
discussion, the payoffs will be shortened to $v_A$ and $v_B$. Note that $v_B = 1 - v_A$ as we only study games 
that cannot end in ties (for the game of Tic-Tac-Toe above, we can arbitrarily let one of the players win 
all draws). 

Thus a payoff matrix for player A in $G_{a,b}$ is denoted by $M_A(G_{a,b})$ and is given by 

$$
(M_A)_{i,j} =
\begin{cases}
\max\Bigl(\max_{G' \in S_A(G)} v_A(G'_{a-j+i,\,b-i+j}),\ \min_{G' \in S_B(G)} v_A(G'_{a-j+i,\,b-i+j})\Bigr) & \text{if A wins the bid,} \\
\min\Bigl(\min_{G' \in S_B(G)} v_A(G'_{a-j+i,\,b-i+j}),\ \max_{G' \in S_A(G)} v_A(G'_{a-j+i,\,b-i+j})\Bigr) & \text{if B wins the bid,}
\end{cases}
$$

where $S_A(G)$ and $S_B(G)$ are the sets of game positions that can be moved to from $G$ by A and B respectively. 
The $(i,j)$ entry corresponds to player A's probability of winning the game after A bids $j$ and B bids 
$i$. Note this is well-defined because the game is zero-sum: by moving to the game state that minimizes 
player A's payoff, player B is maximizing his own payoff at the same time (and vice-versa). Similarly, 
let $M_B(G_{a,b})$ denote the payoff matrix for player B. 

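We notice that if player A bids $x$ and player B bids $y$, this is equivalent to player A bidding $x + z$ and 
player B bidding $y + z$ for any $z$ because the players are paying each other. Thus, we have that payoff 
matrices are Toeplitz, or diagonal-constant. We will write player A's and player B's payoff matrices for 
$G_{a,b}$ as 

$$
M_A =
\begin{pmatrix}
\alpha_0 & \alpha_1 & \cdots & \alpha_a \\
\alpha_{-1} & \alpha_0 & \cdots & \alpha_{a-1} \\
\vdots & \vdots & \ddots & \vdots \\
\alpha_{-b} & \alpha_{-b+1} & \cdots & \alpha_{-b+a}
\end{pmatrix}
\quad \text{and} \quad
M_B =
\begin{pmatrix}
\beta_0 & \beta_1 & \cdots & \beta_b \\
\beta_{-1} & \beta_0 & \cdots & \beta_{b-1} \\
\vdots & \vdots & \ddots & \vdots \\
\beta_{-a} & \beta_{-a+1} & \cdots & \beta_{-a+b}
\end{pmatrix}
$$

respectively. 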
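
As a small illustration of the diagonal-constant structure, the sketch below builds a $(b+1) \times (a+1)$ matrix whose $(i,j)$ entry depends only on the bid difference $j - i$. The helper name `toeplitz_payoff` and the particular 0/1 values of `alpha` are assumptions chosen for the example, not values derived from a specific game.

```python
def toeplitz_payoff(alpha, a, b):
    """Build the (b+1) x (a+1) matrix whose (i, j) entry is alpha(j - i).

    Row i corresponds to player B bidding i, column j to player A bidding j.
    """
    return [[alpha(j - i) for j in range(a + 1)] for i in range(b + 1)]


def alpha(k):
    """Hypothetical payoff: A wins exactly when she outbids B by 0, 1 or 2 chips."""
    return 1 if 0 <= k <= 2 else 0


M_A = toeplitz_payoff(alpha, a=5, b=3)
for row in M_A:
    print(row)
```

Storing only the $a + b + 1$ values $\alpha_{-b}, \ldots, \alpha_a$ is enough to recover the whole matrix.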

We pause to consider a simple example. Let the underlying game be one where player A needs to 
make two moves to win, while player B needs to make only one more move to win. Suppose player A 
has 5 chips while player B has 3 chips. Then we would get the following payoff matrices for player A 
and player B 

$$
M_A =
\begin{pmatrix}
1 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1
\end{pmatrix}
\quad \text{and} \quad
M_B =
\begin{pmatrix}
0 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 \\
0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 \\
1 & 1 & 1 & 0
\end{pmatrix}
$$

respectively. 

A strategy for player A in $G_{a,b}$ is given by an $(a+1)$-dimensional column vector with all non-negative 
entries that sum to 1. The $i$-th entry of this vector (where we start indexing at 0) gives the 
probability that player A will bid $i$ chips. Similarly, a strategy for player B in $G_{a,b}$ is given by a 
$(b+1)$-dimensional column vector satisfying the same conditions. We denote a Nash equilibrium strategy 
in the game $G_{a,b}$ as $S_A(G_{a,b})$ for player A and as $S_B(G_{a,b})$ for player B. Oftentimes we will not be too 
explicit with the size of these vectors; it should be clear from context. 

Note that the $i$th row of $M_A$ corresponds to the payoffs of each of A's pure strategies if her opponent 
B bids $i$. Letting $A_i$ be the $i$th row of $M_A$, we have $A_i \cdot S_A = \alpha_{-i}(S_A)_0 + \cdots + \alpha_{a-i}(S_A)_a = (P_A)_i$, a 
weighted average of A's pure payoffs when B bids $i$. Thus, $(P_A)_i$ is player A's probability of winning if 
her strategy is $S_A$ and her opponent purely bids $i$. For example, the entries $(P_A)_0$ and $(P_A)_1$ of 
$P_A = M_A \cdot S_A$ give the fraction of the time player A wins by playing $S_A$ if player B only bids 0 and 
if player B only bids 1, respectively. 

Now, if player B's strategy is $S_B$, then $S_B^T M_A S_A = (S_B)_0 (P_A)_0 + \cdots + (S_B)_b (P_A)_b$, another weighted 
average of A's payoffs for each of B's pure strategies. Thus, $S_B^T M_A S_A$ is exactly A's payoff if she plays 
$S_A$ and her opponent plays $S_B$, and $S_A^T M_B S_B$ is B's payoff in the same situation. Continuing the 
example, if player B bids only 0 or 1, then $S_B^T M_A S_A = (S_B)_0 (P_A)_0 + (S_B)_1 (P_A)_1$ is player A's 
probability of winning given the strategies $S_A$ and $S_B$. 
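
The two weighted averages described above can be computed mechanically. The sketch below uses the 4 x 6 example matrix for player A given earlier in this section; the two mixed strategies are hypothetical values chosen for illustration, and the helper names `mat_vec` and `dot` are likewise assumptions.

```python
from fractions import Fraction as F


def mat_vec(M, v):
    """Matrix-vector product; entry i is the payoff against the pure bid i."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


# Payoff matrix for player A: row = B's bid, column = A's bid.
M_A = [
    [1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
]

# Hypothetical strategies: each player bids 0 or 1 with equal probability.
S_A = [F(1, 2), F(1, 2), 0, 0, 0, 0]
S_B = [F(1, 2), F(1, 2), 0, 0]

P_A = mat_vec(M_A, S_A)   # A's winning probability against each pure bid of B
payoff_A = dot(S_B, P_A)  # S_B^T M_A S_A

print(P_A, payoff_A)
```

Under these assumed strategies, player A wins every time B bids 0, half the time B bids 1, and three quarters of the time overall.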

We compile these results in the lemma below. 
Lemma 2.1. Let $M_A$ and $M_B$ be payoff matrices for players A and B, respectively, in $G_{a,b}$. Then the 
following statements are true. 

(a) The diagonals of $M_A$ and $M_B$ are constant, i.e. the payoff matrices are Toeplitz. 

(b) Let $\mathbf{1}$ be the appropriately sized matrix whose entries are all 1. Then $M_B = \mathbf{1} - M_A^T$. 

(c) Suppose $(S_A, S_B)$ is a Nash equilibrium. Then $(M_B S_B)_i = v_B$ if $(S_A)_i \neq 0$ and $(M_A S_A)_i = v_A$ if 
$(S_B)_i \neq 0$. 

This lemma provides the basic structure from which many of our main proofs will later follow. 
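
Part (b) is easy to check numerically. The sketch below applies it to the 4 x 6 example matrix for player A from earlier in this section; the helper name `complement_transpose` is an assumption made for the example.

```python
def complement_transpose(M):
    """Return 1 - M^T, i.e. the opponent's payoff matrix in a game without ties."""
    rows, cols = len(M), len(M[0])
    return [[1 - M[i][j] for i in range(rows)] for j in range(cols)]


M_A = [
    [1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
]

M_B = complement_transpose(M_A)
for row in M_B:
    print(row)
```

Applying the same operation twice returns the original matrix, reflecting the symmetry between the two players.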

It is clear that $v_A(G_{a+1,b}) \ge v_A(G_{a,b})$, since player A can always bid as if she did not have the extra 
chip. We now define a class of games pivotal to our analysis in which this inequality is strict. Formally, 
a game $G$ is called precise if in every successor state to $G$, it is strictly better to have one more chip. 

Remark 1. We note that this guarantees a certain strict monotonicity among the entries 
of the payoff matrices. In particular, winning the bid by one less chip is always strictly preferable, as is 
losing by one more chip. Thus we have that for the player with advantage, $\alpha_i > \alpha_j$ for $0 \le i < j$ and 
$\alpha_i > \alpha_j$ for $i < j < 0$. A similar relationship holds for the player without advantage, except that a 
tied bid now counts as losing the bid: $\beta_i > \beta_j$ for $0 < i < j$ and $\beta_i > \beta_j$ for $i < j \le 0$. 

Definition. A strategy $S$ has length $\ell = \ell(S)$ if $S_{\ell-1} \neq 0$ and $S_m = 0$ for all $m \ge \ell$. A strategy $S$ is 
gap-free if $S_i, S_j \neq 0$ implies $S_k \neq 0$ for all $i < k < j$. 

The definition of length encapsulates the observation that unless the game is close to completion, 
players will never bid a large proportion of their chips. The second definition seems more arbitrary at 
the moment, but it plays a pivotal role in the following Proposition and will serve to greatly simplify the 
language throughout the paper. 
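
Both definitions translate directly into code. In the sketch below a strategy is a list of probabilities indexed by bid; the function names `strategy_length` and `is_gap_free` are assumptions introduced for illustration.

```python
def strategy_length(S):
    """Return l such that S[l-1] != 0 and S[m] == 0 for all m >= l."""
    for index in range(len(S) - 1, -1, -1):
        if S[index] != 0:
            return index + 1
    return 0  # the all-zero vector (not a valid strategy) has no nonzero entry


def is_gap_free(S):
    """True if the bids played with nonzero probability are consecutive."""
    support = [i for i, p in enumerate(S) if p != 0]
    return not support or support[-1] - support[0] + 1 == len(support)


print(strategy_length([0.5, 0.5, 0, 0]))  # bids 0 and 1 only
print(is_gap_free([0.5, 0, 0.5, 0]))      # bid 1 is skipped
```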

Proposition 2.2. Let $G_{a,b}$ be precise. Any equilibrium strategy for the player with advantage is gap-free 
and bids 0 with nonzero probability, while any equilibrium strategy for the other player is gap-free and 
bids 1 with nonzero probability. If the player with advantage has an equilibrium strategy of length $\ell$, any 
equilibrium strategy for the other player has length $\ell$ or $\ell + 1$. 

Proof. Suppose without loss of generality that player A has advantage, and let $S_A = (s_0, \ldots, s_a)$ and 
$S_B = (t_0, \ldots, t_b)$ be equilibrium strategies for players A and B respectively. We claim that if $i \ge 0$, 

(i) $s_i = 0$ implies $t_{i+1} = 0$, and 

(ii) $t_{i+1} = 0$ implies $s_{i+1} = 0$. 

If $s_i = 0$ and $t_{i+1} > 0$, player B should alter his strategy so that he bids $i$ with probability $t_i + t_{i+1}$ and 
$i + 1$ with probability 0. This saves player B a chip whenever he would have bid $i + 1$ without changing 
any possible outcome of these bids, and all other possibilities are unchanged. By precision, this new 
strategy is strictly better than $S_B$ for player B, a contradiction. This proves (i). 

If $t_{i+1} = 0$ and $s_{i+1} > 0$, player A should alter her strategy so that she bids $i$ with probability $s_i + s_{i+1}$ 
and $i + 1$ with probability 0. As in the previous case this new strategy is strictly better for player A, a 
contradiction, proving (ii). 

Together, (i) and (ii) complete the proof except in the case when $S_B = (1, 0, \ldots, 0)$. However, in this 
case an optimal strategy for player A is to also bid 0 with probability 1, and it follows that $G_{a,b}$ is not 
precise. □ 

This characterization of equilibrium strategies is what motivated our restriction to precise games. 
In the presence of precision, an easily observable, yet highly unexpected relationship between opposing 
optimal strategies appears. This relationship forms the foundation for the rest of our results. 

Definition. The reverse of a length $\ell$ strategy $S$ is given by 

$$\mathcal{R}(S) = \mathcal{R}((s_0, s_1, \ldots, s_{\ell-1}, 0, \ldots, 0)) = (s_{\ell-1}, s_{\ell-2}, \ldots, s_0, 0, \ldots, 0),$$

where the number of trailing zeroes will be clear from context. 
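
A minimal sketch of the reverse operation follows. Because the two players' strategy vectors generally have different sizes, the hypothetical helper `reverse_strategy` takes an optional `size` argument for the number of entries in the result; both names are assumptions made for this example.

```python
from fractions import Fraction


def reverse_strategy(S, size=None):
    """Reverse the first l entries of S (l = length of S), then pad with zeros."""
    length = max((i + 1 for i, p in enumerate(S) if p != 0), default=0)
    size = len(S) if size is None else size
    if size < length:
        raise ValueError("the target vector is too short to hold the reversed strategy")
    return list(reversed(S[:length])) + [0] * (size - length)


S = [Fraction(1, 4), Fraction(3, 4), 0, 0]
print(reverse_strategy(S))          # same number of entries as S
print(reverse_strategy(S, size=3))  # resized for an opponent with 2 chips
```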

Theorem 2.3. Suppose that $G_{a,b}$ is precise, and that $S$ is an equilibrium strategy for the player with 
advantage. Then $\mathcal{R}(S)$ is an equilibrium strategy for the player without advantage. 
Proof. Suppose without loss of generality that player A has advantage, and $S = S_A = (s_0, \ldots, s_{\ell-1}, 0, \ldots, 0)$ 
has length $\ell$. By Lemma 2.1 and Proposition 2.2 we have 

$$
M_A \cdot S_A =
\begin{pmatrix}
\alpha_0 & \alpha_1 & \cdots & \alpha_a \\
\alpha_{-1} & \alpha_0 & \cdots & \alpha_{a-1} \\
\vdots & \vdots & \ddots & \vdots \\
\alpha_{-b} & \alpha_{1-b} & \cdots & \alpha_{a-b}
\end{pmatrix}
\begin{pmatrix} s_0 \\ s_1 \\ \vdots \\ s_{\ell-1} \\ 0 \\ \vdots \\ 0 \end{pmatrix}
=
\begin{pmatrix} w_0 \\ v_A \\ \vdots \\ v_A \\ w_\ell \\ \vdots \\ w_b \end{pmatrix}
\tag{2}
$$

where $w_0, w_\ell, \ldots, w_b \ge v_A$. We claim further that $w_0 = v_A$. 

Suppose for a contradiction that $w_0 > v_A$. Then if $S_B$ is an equilibrium strategy for player B, by 
Lemma 2.1 and Proposition 2.2 it is of the form $S_B = (0, t_1, t_2, \ldots, t_\ell, 0, \ldots, 0)$ where $t_1 > 0$, but possibly 
$t_\ell = 0$. 

When played against $S_A$, $S_B$ gives a payoff of $v_B$. Let $v'_B$ be player B's payoff against $S_A$ when he 
plays the shifted strategy $S'_B = (t_1, \ldots, t_\ell, 0, \ldots, 0)$. Since $(S_A, S_B)$ is a Nash equilibrium, $v'_B \le v_B$. 
On the other hand, player A can guarantee a payoff of $1 - v'_B$ against $S_B$ by using the strategy $S'_A = 
(0, s_0, s_1, \ldots, s_{\ell-2}, s_{\ell-1}, \ldots, 0)$ since the probability of any given difference in bids occurring is the same 
in $(S'_A, S_B)$ as in $(S_A, S'_B)$. Therefore $v'_B \ge v_B$, so $v_B = v'_B$, whence $S_B^T \cdot M_A \cdot S_A = S_B'^T \cdot M_A \cdot S_A$. 
Expanding this, we find 

$$0 \cdot w_0 + t_1 \cdot v_A + \cdots + t_{\ell-1} \cdot v_A + t_\ell \cdot w_\ell = t_1 \cdot w_0 + t_2 \cdot v_A + \cdots + t_\ell \cdot v_A.$$

Suppose $w_\ell > v_A$. Then, we must have $t_\ell = 0$, which solves to get $w_0 = v_A$. If $w_\ell = v_A$, the equation 
solves the same way to get $w_0 = v_A$. Thus, either way we have a contradiction of $w_0 > v_A$. Thus, 
$w_0 = v_A$. Together with (2), this gives 

$$v_A = \alpha_0 s_0 + \cdots + \alpha_{\ell-1} s_{\ell-1} = \cdots = \alpha_{1-\ell} s_0 + \cdots + \alpha_0 s_{\ell-1}. \tag{3}$$

By Lemma 2.1 we have $M_B = \mathbf{1} - M_A^T$, so 

$$
M_B \cdot \mathcal{R}(S_A) =
\begin{pmatrix}
1 - \alpha_0 & 1 - \alpha_{-1} & \cdots & 1 - \alpha_{-b} \\
1 - \alpha_1 & 1 - \alpha_0 & \cdots & 1 - \alpha_{1-b} \\
\vdots & \vdots & \ddots & \vdots \\
1 - \alpha_a & 1 - \alpha_{a-1} & \cdots & 1 - \alpha_{a-b}
\end{pmatrix}
\begin{pmatrix} s_{\ell-1} \\ \vdots \\ s_0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.
$$

For $0 \le i \le \ell - 1$ the $i$th entry of this product is $(1 - \alpha_i) s_{\ell-1} + \cdots + (1 - \alpha_{i-\ell+1}) s_0 = (s_0 + \cdots + s_{\ell-1}) - (\alpha_{i-\ell+1} s_0 + \cdots + \alpha_i s_{\ell-1}) = 
1 - v_A = v_B$ by equation (3). In other words, $\mathcal{R}(S_A)$ guarantees player B his highest possible payoff 
against $S_A$, so he has no incentive to deviate from $\mathcal{R}(S_A)$ if player A uses $S_A$. 

We now show player A has no incentive to deviate from $S_A$ against $\mathcal{R}(S_A)$. If $\ell \le i \le a$, the payoff 
for player B if player A bids $i$ will be $(s_0 + \cdots + s_{\ell-1}) - (\alpha_{i-\ell+1} s_0 + \cdots + \alpha_i s_{\ell-1})$. By the formulation 
of precision in terms of payoff matrices in Remark 1 we have strict inequalities $\alpha_{i-\ell+1} < \alpha_0$ through 
$\alpha_i < \alpha_{\ell-1}$, so $\alpha_{i-\ell+1} s_0 + \cdots + \alpha_i s_{\ell-1} < \alpha_0 s_0 + \cdots + \alpha_{\ell-1} s_{\ell-1} = v_A$. Thus player A loses utility if she 
bids any amount greater than $\ell - 1$ with positive probability. One also readily sees that if she alters her 
distribution of bids $0, \ldots, \ell - 1$ this will not change her payoff against $\mathcal{R}(S_A)$. It follows that $(S_A, \mathcal{R}(S_A))$ 
is a Nash equilibrium as claimed. □ 

The Reverse Theorem reveals a strong relationship between opposing players' strategies. Using it, 
we can now fully characterize the set of optimal strategies for both players in precise games. 
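
The Reverse Theorem can be checked by hand on a toy instance. The sketch below uses a hypothetical 2 x 2 Toeplitz payoff matrix (each player holds one chip, and player A has advantage); the values of `alpha` are assumptions chosen only to satisfy the monotonicity in Remark 1, not payoffs computed from an actual game. In a game whose payoffs sum to 1, a pair of strategies whose guaranteed payoffs also sum to 1 is a Nash equilibrium, which is the check performed here.

```python
from fractions import Fraction as F


def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def complement_transpose(M):
    """Return 1 - M^T, player B's payoff matrix (Lemma 2.1 (b))."""
    return [[1 - M[i][j] for i in range(len(M))] for j in range(len(M[0]))]


# Hypothetical payoffs for A indexed by (A's bid) - (B's bid).
alpha = {-1: F(1, 5), 0: F(3, 5), 1: F(2, 5)}
M_A = [[alpha[j - i] for j in range(2)] for i in range(2)]  # row = B's bid
M_B = complement_transpose(M_A)                             # row = A's bid

S_A = [F(1, 3), F(2, 3)]        # equalizes A's payoff over both bids of B
R_S_A = list(reversed(S_A))     # the reverse of S_A, played by B

guarantee_A = min(mat_vec(M_A, S_A))
guarantee_B = min(mat_vec(M_B, R_S_A))
print(guarantee_A, guarantee_B, guarantee_A + guarantee_B)
```

Here the reversed strategy secures player B exactly the share of the total payoff that $S_A$ leaves him, so neither player can gain by deviating.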

Theorem 2.4. If $G_{a,b}$ is precise, the player with advantage has a unique equilibrium strategy. 

Proof. Let player A have advantage. Suppose that $S_A$ and $S'_A$ are distinct equilibrium strategies for 
player A. Let $S_A$ and $S'_A$ have lengths $\ell$ and $\ell'$ respectively. By the Reverse Theorem, player B has 
strategies $\mathcal{R}(S_A)$ and $\mathcal{R}(S'_A)$ which have lengths $\ell$ and $\ell'$ respectively. Suppose $\ell' \neq \ell$. Assume, without 
loss of generality, that $\ell' > \ell$. Then $\mathcal{R}(S_A)$ is a Nash equilibrium strategy for B that is shorter than 
player A's equilibrium strategy $S'_A$, which contradicts Proposition 2.2. Thus, $\ell = \ell'$. 

Assume, without loss of generality, that $(M_A S_A)_\ell \ge (M_A S'_A)_\ell$. That is, we assume that if B bids $\ell$ 
against $S_A$ he will do no better than if he were bidding $\ell$ against $S'_A$. It is possible he will do strictly worse 
as bidding $\ell$ is not necessarily a part of player B's optimal strategy. Consider the following function: 

$$S(x) = S'_A + x(S_A - S'_A)$$

We claim that for any $x$ for which $S(x)$ is a valid strategy, $S(x)$ is an optimal strategy. Note that $S(x)$ 
has entrywise sum of 1 so $S(x)$ is at least valid for $x \in [0,1]$. Consider: 

$$(M_A S(x))_i = (M_A S'_A)_i + x(M_A S_A - M_A S'_A)_i$$

For $i < \ell$, $(M_A S_A)_i = (M_A S'_A)_i = v_A$ so $(M_A S(x))_i = v_A$. For $i = \ell$, $(M_A S_A)_i \ge (M_A S'_A)_i$ so 
$(M_A S(x))_i \ge (M_A S'_A)_i \ge v_A$. If player B bids anything greater than $\ell$ then he will do strictly worse 
than if he bid $\ell$, because he will win by more than he would by bidding $\ell$. Therefore, $S(x)$ guarantees 
player A a payoff of at least $v_A$. Choose the maximal $x^*$ for which $S(x^*)$ is valid. Because $S(x)$ has 
entrywise sum of 1, it is only invalid if $S(x)$ has a negative entry. Thus, at this maximal $x^*$, $S(x^*)$ has at 
least one zero entry. Either $S(x^*)$ has length less than $\ell$, a 0 in its first entry, or is not gap-free. Each of 
these is impossible (above, Proposition 2.2). Therefore distinct optimal strategies $S_A$ and $S'_A$ cannot exist. □ 
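
The interpolation step in this proof, pushing $S(x)$ along a line until some entry reaches zero, can be illustrated numerically. The two strategies below are hypothetical and are not claimed to be equilibria of any game; the sketch only shows how the maximal $x^*$ is found, and the helper names are assumptions.

```python
from fractions import Fraction as F


def interpolate(S_prime, S, x):
    """Return S(x) = S' + x (S - S')."""
    return [p + x * (s - p) for p, s in zip(S_prime, S)]


def maximal_valid_x(S_prime, S):
    """Largest x for which every entry of S(x) is nonnegative (None if unbounded)."""
    # Only entries that decrease as x grows (s < p) can become negative.
    bounds = [p / (p - s) for p, s in zip(S_prime, S) if s < p]
    return min(bounds) if bounds else None


S_prime = [F(1, 2), F(1, 2)]
S = [F(1, 4), F(3, 4)]

x_star = maximal_valid_x(S_prime, S)
print(x_star, interpolate(S_prime, S, x_star))
```

At $x^*$ the first entry has been driven to zero while the entries still sum to 1, which is the kind of strategy the proof shows cannot be optimal.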

In most precise games, both players have unique optimal strategies. It is possible, however, to 
construct a game in which the player without advantage has multiple optimal strategies. We give a 
characterization of these as well. If S is a strategy let (0, S) represent a new strategy where anytime one 
would bid i in S he will bid i + 1 in (0, S). 
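
The shifted strategy $(0, S)$, and mixtures such as those appearing in Theorem 2.5 (3), can be sketched as follows. The helper names `shift_up` and `mix` and the sample strategy are assumptions made for illustration.

```python
from fractions import Fraction as F


def shift_up(S):
    """Return (0, S): every bid i in S becomes a bid of i + 1."""
    if S[-1] != 0:
        raise ValueError("cannot shift: the largest bid already has positive probability")
    return [0] + list(S[:-1])


def mix(t, S, T):
    """Return the convex combination t*S + (1 - t)*T for t in [0, 1]."""
    if not 0 <= t <= 1:
        raise ValueError("t must lie in [0, 1]")
    return [t * s + (1 - t) * u for s, u in zip(S, T)]


S = [F(2, 3), F(1, 3), 0]
shifted = shift_up(S)
print(shifted)
print(mix(F(1, 2), S, shifted))
```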

Theorem 2.5. Let $G_{a,b}$ be precise and let player A have advantage. The following statements hold: 

(1) Player B has a unique strategy of minimal length. This strategy is $\mathcal{R}(S_A)$. 

(2) If player B has more than one optimal strategy, then another optimal strategy is of the form 
$(0, \mathcal{R}(S_A))$. 

(3) All other optimal strategies for player B are of the form 

$$t\,\mathcal{R}(S_A) + (1 - t)(0, \mathcal{R}(S_A)), \qquad t \in [0,1].$$


Proof. Throughout this proof we will use a method from the proof of Theorem 2.4. Suppose we have 
two strategies $P$ and $T$ such that wherever $P$ is non-zero so is $T$. Then we define 

$$E(x) = T + (P - T)x$$

We showed above that $E(x)$ gives an optimal strategy as long as it is valid. If we choose $x^*$ to be maximal 
so that $E(x^*)$ is valid, then $E(x^*)$ gives an optimal strategy with a 0 in some spot where $T$ was nonzero. 
Let us call the strategy produced by this method $E(P,T)$. 

We begin with (1). By the Reverse Theorem, player B has a strategy $\mathcal{R}(S_A)$ which is of the same 
length as $S_A$. By Proposition 2.2 player B cannot have a strategy shorter than $S_A$. Therefore, $\mathcal{R}(S_A)$ 
is a strategy of minimal length for player B. Suppose $S$ is another strategy of minimal length for player 
B. Then $S^* = E(S, \mathcal{R}(S_A))$ is either of lesser length, is not gap-free, or has a 0 in the first entry. The 
first two possibilities are impossible by Proposition 2.2. In the third case, we can apply the same method 
again to get $E(S, S^*)$ which is either of lesser length, not gap-free, or has 0's in the first two entries. 
Each of these is impossible by Proposition 2.2. 

We now proceed to (2). Suppose player B has more than one optimal strategy. Then by (1) it must 
be of length greater than $\mathcal{R}(S_A)$. Let $\ell$ be the length of $\mathcal{R}(S_A)$. By Proposition 2.2 any other optimal 
strategy of player B must be of length $\ell + 1$. Let $S$ be such a strategy. Suppose $S_0 \neq 0$. Then we can 
take $S' = E(\mathcal{R}(S_A), S)$ which must have a 0 in the first coordinate lest we contradict Proposition 2.2. 
We must show that $S' = (0, \mathcal{R}(S_A))$. Because $M_B$ is Toeplitz, 

$$(M_B \cdot (0, \mathcal{R}(S_A)))_{i+1} = (M_B \cdot \mathcal{R}(S_A))_i$$

Therefore $(0, \mathcal{R}(S_A))$ guarantees player B at least his optimal payoff unless player A plays 0. Suppose 
that if player A bids 0 then $(0, \mathcal{R}(S_A))$ gives player B a payoff of $v$ less than his optimal payoff of $v_B$. 
Then define a strategy, 

$$S^{\Delta} = \frac{S' - c(0, \mathcal{R}(S_A))}{1 - c}$$

for $c$ sufficiently small so that $S' - c(0, \mathcal{R}(S_A))$ has no negative entries. Then $S^{\Delta}$ is a valid strategy that 
guarantees player B his optimal payoff if player A bids anything from 1 to $\ell + 1$. It guarantees player B 
more than his optimal payoff if player A bids 0 as: 

$$\left(M_B \cdot \frac{S' - c(0, \mathcal{R}(S_A))}{1 - c}\right)_0 = \frac{1}{1 - c} \cdot (v_B - cv) > \frac{1}{1 - c} \cdot (v_B - c v_B) = v_B$$

$S^{\Delta}$ is a strictly better strategy than $S'$ as player A always bids 0 with nonzero probability. $S'$ is optimal 
so this is impossible. Thus, $(0, \mathcal{R}(S_A))$ is an optimal strategy. That it is equal to $S'$ will follow from (3). 

Finally we prove (3). $\mathcal{R}(S_A)$ and $(0, \mathcal{R}(S_A))$ are optimal strategies so any convex combination of the 
two is optimal. Let $S^*$ be an optimal strategy for player B that is not a convex combination of the two. 
Then, $S^*$ must be of length $\ell + 1$. Therefore we can take $E((0, \mathcal{R}(S_A)), S^*)$. This gives a strategy which 
is either of length $\ell$, is not gap-free, or has multiple 0's at the beginning. The latter two possibilities are 
impossible by Proposition 2.2. $\mathcal{R}(S_A)$ is the unique optimal strategy of length $\ell$ so: 

$$\mathcal{R}(S_A) = (0, \mathcal{R}(S_A)) + x(S^* - (0, \mathcal{R}(S_A)))$$

$$\frac{1}{x}\mathcal{R}(S_A) + \frac{x - 1}{x}(0, \mathcal{R}(S_A)) = S^*$$

Note that $\frac{1}{x} + \frac{x-1}{x} = 1$ and both coefficients must be positive or else the first or last entry of $S^*$ will be 
negative. Thus, $S^*$ is a convex combination of $\mathcal{R}(S_A)$ and $(0, \mathcal{R}(S_A))$. □ 


3 Imprecise Games 

3.1 Adjustments for Precision 

In most of the above proofs we assume that $G_{a,b}$ is a precise game. In many games with small associated 
graphs, this is not the case. The simplest example is a game where in the associated graph the only 
directed edge goes to the winning vertex for player A. Then player A always wins, so the chip counts do not matter whatsoever. Thus, 
we apply a small adjustment to the payoff matrices for players A and B. Pick a small $x > 0$. We now 
define $M_A^x(G_{a,b})$ as 

$$M_A^x(G_{a,b}) = M_A(G_{a,b}) + xB_{a,b}$$

where $B_{a,b}$ is given by the $(b + 1) \times (a + 1)$ Toeplitz matrix 

$$
B_{a,b} =
\begin{pmatrix}
a & a - 1 & \cdots & 0 \\
a + 1 & a & \cdots & 1 \\
\vdots & \vdots & \ddots & \vdots \\
a + b & a + b - 1 & \cdots & b
\end{pmatrix}
$$

Intuitively, we can think of $xB_{a,b}$ as a payoff matrix that gives payoff $x$ for each chip a player has at the 
end of a turn. $S_A^x(G_{a,b})$ is then given by the strategy that maximizes player A's minimum payoff under 
$M_A^x(G_{a,b})$, and $v_A^x(G_{a,b})$ denotes that payoff. 
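
The adjustment is simple to compute. The sketch below builds the chip-count matrix and adds $x$ times it to a payoff matrix; the helper names `chip_matrix` and `adjusted_payoff` are assumptions, and the all-ones input models the degenerate game in which player A always wins.

```python
from fractions import Fraction as F


def chip_matrix(a, b):
    """(b+1) x (a+1) matrix whose (i, j) entry is a - j + i,
    player A's chip count after A bids j and B bids i."""
    return [[a - j + i for j in range(a + 1)] for i in range(b + 1)]


def adjusted_payoff(M_A, x):
    """Return M_A + x * B, where B is the chip matrix of matching size."""
    b, a = len(M_A) - 1, len(M_A[0]) - 1
    B = chip_matrix(a, b)
    return [[M_A[i][j] + x * B[i][j] for j in range(a + 1)] for i in range(b + 1)]


# Degenerate game: A wins no matter what, with a = 2 and b = 1 chips.
M_A = [[1, 1, 1], [1, 1, 1]]
for row in adjusted_payoff(M_A, F(1, 100)):
    print(row)
```

After the adjustment every entry strictly decreases along each row and strictly increases down each column, so holding one more chip is now strictly better even though the original payoffs were all equal.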

While the payoff no longer corresponds exactly to winning probability, the game $G^x_{a,b}$ is still zero-sum, 
with total utility $1 + (a + b)x$ split between the two players. We generalize our Lemma 2.1 to this new 
game: 

Lemma 3.1. The game represented by $M_A^x(G_{a,b})$ is precise. 

Proof. Each entry of $M_A^x(G_{a,b})$ represents a successor state of the game where each player has some 
number of chips. From the way we have defined $B_{a,b}$, for any successor state in which having one 
more chip provided an equal payoff in $G_{a,b}$, having one more chip will now provide a payoff exactly $x$ 
greater. □ 

A natural question arising from this adjustment is whether or not it gives a good approximation of 
the actual payoff for $G_{a,b}$ and the actual Nash equilibria. The following theorem shows that by choosing 
a small enough $x$, $M_A^x$, $v_A^x$, and $S_A^x$ can be made arbitrarily close to $M_A$, $v_A$ and some Nash equilibrium 
strategy $S_A$. 
Theorem 3.2. With $S_A^x$ as described above, 

(1) $\lim_{x \to 0} M_A^x(G_{a,b}) = M_A(G_{a,b})$, 

(2) $\lim_{x \to 0} v_A^x(G_{a,b}) = v_A(G_{a,b})$, 

(3) $\lim_{x \to 0} S_A^x(G_{a,b}) = S_A(G_{a,b})$. 

Proof of (1) and (2). We notice that (1) follows directly from the definition of $M_A^x$: 

$$\lim_{x \to 0} M_A^x(G_{a,b}) = \lim_{x \to 0} \bigl(M_A(G_{a,b}) + xB_{a,b}\bigr) = M_A(G_{a,b}).$$

We now consider (2). We can define three functions: 

$$g(x) = v_A^x(G_{a,b}) = \min_i \bigl(M_A^x(G_{a,b}) \cdot S_A^x(G_{a,b})\bigr)_i$$

$$
\begin{aligned}
v_A^x(G_{a,b}) &= \min_i \bigl(M_A^x(G_{a,b}) \cdot S_A^x(G_{a,b})\bigr)_i \\
&= \min_i \bigl(M_A(G_{a,b}) \cdot S_A^x(G_{a,b}) + xB \cdot S_A^x(G_{a,b})\bigr)_i \\
&\le \min_i \bigl(M_A(G_{a,b}) \cdot S_A^x(G_{a,b})\bigr)_i + \max_i \bigl(xB \cdot S_A^x(G_{a,b})\bigr)_i \\
&\le \min_i \bigl(M_A(G_{a,b}) \cdot S_A(G_{a,b})\bigr)_i + \max_i (xB \cdot \mathbf{1})_i \\
&= v_A(G_{a,b}) + \max_i (xB \cdot \mathbf{1})_i = h(x)
\end{aligned}
$$

$$
\begin{aligned}
v_A^x(G_{a,b}) &= \min_i \bigl(M_A^x(G_{a,b}) \cdot S_A^x(G_{a,b})\bigr)_i \\
&\ge \min_i \bigl(M_A^x(G_{a,b}) \cdot S_A(G_{a,b})\bigr)_i = f(x)
\end{aligned}
$$

Notice that for all $x > 0$, $f(x) \le g(x) \le h(x)$. We also see that 

$$\lim_{x \to 0} f(x) = \lim_{x \to 0} \min_i \bigl(M_A^x(G_{a,b}) \cdot S_A(G_{a,b})\bigr)_i = \min_i \bigl(M_A(G_{a,b}) \cdot S_A(G_{a,b})\bigr)_i = v_A(G_{a,b})$$

$$\lim_{x \to 0} h(x) = \lim_{x \to 0} \Bigl(v_A(G_{a,b}) + \max_i (xB \cdot \mathbf{1})_i\Bigr) = v_A(G_{a,b}) + \max_i (B \cdot \mathbf{1})_i \cdot \lim_{x \to 0} x = v_A(G_{a,b})$$

Therefore, 

$$\lim_{x \to 0} g(x) = \lim_{x \to 0} v_A^x(G_{a,b}) = v_A(G_{a,b}).$$

□ 


This leaves (3), the proof of which is more nuanced. We must first develop some more theory of 
all-pay bidding games. 


3.2 Restricted Games 

In many bidding games, the random distribution governing optimal play does not involve bidding above 
some threshold. In a game of Bidding Tic-Tac-Toe where each player begins with 100 chips, a player 
should not bid 100 on the first turn. By the Reverse Theorem, the two players have optimal strategies 
of equal length. Suppose in some bidding game $G_{a,b}$, both players have strategies of length $\ell$. Then we 
can consider the restricted game $G_{a,b} \mid \ell$, where both players can bid at most $\ell - 1$ on the first turn 
and play returns to normal thereafter. In such a restricted game players are still able to play the length 
$\ell$ optimal strategy they would have employed in the original game. Is this strategy still optimal? 

Lemma 3.3. If $S_A, S_B$ are optimal length $\ell$ strategies in $G_{a,b}$ that provide the payoffs $v_A$ and $1 - v_A$ 
respectively, then they are optimal in $G_{a,b} \mid \ell$ and provide the same payoffs. 

Proof. $M_A(G_{a,b} \mid \ell)$ is the $\ell \times \ell$ top-left minor of $M_A(G_{a,b})$, as the games are identical after the first 
move. Thus, both players bidding less than $\ell$ in $G_{a,b}$ is equivalent to the players making the same bids 
in $G_{a,b} \mid \ell$. Thus, $M_A(G_{a,b} \mid \ell) \cdot S_A$ gives the first $\ell$ entries of $M_A(G_{a,b}) \cdot S_A$. The minimum entry of 
$M_A(G_{a,b}) \cdot S_A$ is $v_A$, so the minimum entry of $M_A(G_{a,b} \mid \ell) \cdot S_A$ is at least $v_A$. Thus, $S_A$ guarantees at 
least the payoff $v_A$. Using the same logic for $M_B(G_{a,b} \mid \ell)$, we obtain that $S_B$ guarantees a payoff of at 
least $1 - v_A$. The total payoff is exactly 1, so player A gets payoff $v_A$ and cannot do better, and player B 
gets the payoff $1 - v_A$ and cannot do better. □ 

Furthermore, recall that precision is a characteristic of the successor states in a game. The possible 
successors of a restricted game are a subset of the successors of the normal game. Thus, if a game 
is precise then its restricted game is also precise. We are now able to state a powerful result for the 
restricted game that will allow us to prove some important results for general bidding games. 

Lemma 3.4. In a precise game $G_{a,b}$, if $S_A$, an optimal strategy of minimal length, has length $\ell$, then 
$M_A(G_{a,b} \mid \ell)$ is invertible. 

Proof. Suppose by way of contradiction that there exists $y \neq 0$ such that $M_A(G_{a,b} \mid \ell) \cdot y = 0$. Define 
$\bar{y} \in \mathbb{R}^{a+1}$ by $\bar{y}_i = y_i$ for $0 \le i \le \ell - 1$ and $\bar{y}_i = 0$ for $i \ge \ell$. Then $M_A(G_{a,b}) \cdot \bar{y}$ is a vector with 0 in its first 
$\ell$ entries. In particular, $(M_A(G_{a,b}) \cdot \bar{y})_0 = 0$. $S_A$ has all positive entries in its first $\ell$ coordinates, so there exists $c \in \mathbb{R}$ such that 
$S_+ = S_A + c\bar{y}$ and $S_- = S_A - c\bar{y}$ have all positive entries there as well. We note that: 

\begin{align*}
(M_A(G_{a,b}) \cdot S_+)_i &= (M_A(G_{a,b}) \cdot S_A)_i + (M_A(G_{a,b}) \cdot c\bar{y})_i = v_A + 0 = v_A \\
(M_A(G_{a,b}) \cdot S_-)_i &= (M_A(G_{a,b}) \cdot S_A)_i - (M_A(G_{a,b}) \cdot c\bar{y})_i = v_A - 0 = v_A
\end{align*}

for $0 \le i \le \ell - 1$. Suppose the sum of the entries of $S_+$ is less than 1. Then there exists $k > 1$ such 
that the sum of the entries of $kS_+$ is equal to 1. Then $kS_+$ is a valid strategy for player A that gives 
payoff $k v_A > v_A$ against player B's first $\ell$ pure strategies. Thus, against $\mathcal{R}(S_A)$, $kS_+$ is better than $S_A$, 
so $(S_A, \mathcal{R}(S_A))$ is not a Nash equilibrium. Contradiction. Then suppose the sum of the entries of $S_+$ 
is greater than 1. Then the sum of the entries of $S_-$ is less than 1, so the same argument holds. Then 
suppose the sum of the entries of $S_+$ equals 1. Then $S_+$ and $S_A$ are both optimal in $G_{a,b} \mid \ell$. $G_{a,b} \mid \ell$ is 
precise, however, so there exists only one optimal strategy of minimal length for either player in $G_{a,b} \mid \ell$. 
Therefore $y$ must equal 0. □ 

A method for computing optimal strategies for the player with advantage, say player A, now becomes 
apparent. Given the length of the player’s unique optimal strategy we can consider the payoff matrix 
of the restricted game. By the Reverse Theorem, player B has a gap-free strategy of the same length. 
Then the restricted payoff matrix multiplied by player A’s optimal strategy must give a constant vector. 
The inverse of our restricted payoff matrix multiplied by some non-zero constant vector will therefore 
give a scalar multiple of player A’s optimal strategy. 

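Theorem 3.5. Let player A have advantage. In a precise game $G_{a,b}$, if $S_A$ has length $\ell$ then 

\[ S_A = \frac{M_A(G_{a,b} \mid \ell)^{-1} \, \mathbf{1}}{\mathbf{1}^T M_A(G_{a,b} \mid \ell)^{-1} \, \mathbf{1}}. \]

Proof. As discussed above, $M_A(G_{a,b} \mid \ell)^{-1} \cdot \mathbf{1}$ is a scalar multiple of $S_A$. The sum of the entries of 
$S_A$ is 1, so we need only divide by the sum of the entries of $M_A(G_{a,b} \mid \ell)^{-1} \cdot \mathbf{1}$. This is given by 
$\mathbf{1}^T M_A(G_{a,b} \mid \ell)^{-1} \, \mathbf{1}$. □ 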

This theorem gives an explicit and rapid method for computing optimal strategies for a player with 
advantage. Combined with the Reverse Theorem, we will be able to develop a method for computing 
optimal strategies for both players in any simple bidding game. First, we will return to (3) of Theorem 
3.2. 
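The formula of Theorem 3.5 amounts to solving one linear system and rescaling. The following Python sketch illustrates this with exact rational arithmetic. The helper names `solve` and `advantage_strategy` are hypothetical, the solver is plain Gauss-Jordan elimination rather than a Toeplitz-specific method, and the $2 \times 2$ matrix is an invented example, not the payoff matrix of any particular game.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve M y = rhs by Gauss-Jordan elimination; return None if M is singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def advantage_strategy(M_restricted):
    """Return M^{-1} 1 / (1^T M^{-1} 1) for the restricted payoff matrix."""
    y = solve(M_restricted, [1] * len(M_restricted))
    if y is None:
        raise ValueError("restricted payoff matrix is singular")
    total = sum(y)
    return [v / total for v in y]

# Hypothetical restricted payoff matrix (rows: B's bids, columns: A's bids).
M = [[Fraction(1, 2), Fraction(1, 4)],
     [Fraction(1, 4), Fraction(3, 4)]]
S = advantage_strategy(M)
payoffs = [sum(m * s for m, s in zip(row, S)) for row in M]
```

For this matrix the resulting vector `S` sums to 1 and `payoffs` is constant, which is the defining property used in the proof: the strategy yields the same payoff against each of the opponent's pure bids in the restricted game.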


3.3 Convergence of Strategies 


Recall our conjecture that as $x \to 0$, $S_A^x(G_{a,b}) \to S_A(G_{a,b})$. The above theorem gives even more weight 
to this claim, as together they give a method for approximating optimal strategies for imprecise games 
via a convergent sequence of strategies for precise games. 

We begin by partially extending the invertibility of the restricted payoff matrix to imprecise games. 
The importance of this result is not immediately obvious, but it will be integral to the proof of part (3) 
of Theorem 3.2. For simplicity, we will sometimes write $M_A(G_{a,b} \mid \ell)$ as $M_A(\ell)$ and $M_B(G_{a,b} \mid \ell)$ as 
$M_B(\ell)$. 

Proposition 3.6. If player A has a length $\ell$ optimal strategy for $M_A^x = M_A(G^x_{a,b})$, then at least one of 
$M_A(G_{a,b} \mid \ell)$ and $M_B(G_{a,b} \mid \ell) = \mathbf{1} - M_A(G_{a,b} \mid \ell)^T$ is invertible. Here $\mathbf{1}$ denotes the all-ones matrix. 

Proof. For simplicity, let $M_A = M_A(G_{a,b} \mid \ell)$ and $M_B = M_B(G_{a,b} \mid \ell)$. If $M_A$ is invertible, we are done, 
so suppose $M_A$ is not invertible. Let $w \neq 0$ be in the nullspace of $M_A$. Because $S_A^x$ is gap-free, there 
exists $c > 0$ sufficiently small such that $S_A^x \pm cw$ are valid strategies for player A. Then 

\[ M_A^x \cdot (S_A^x \pm cw) = M_A^x S_A^x \pm (M_A + xB) \cdot cw = M_A^x S_A^x \pm cxBw. \]

Each successive row in $B$ is 1 greater in each entry than the previous row. Suppose that the sum of the 
entries of $w$ is equal to 0. Then, 

\[ (Bw)_{i+1} = (Bw)_i + (1, \ldots, 1) \cdot w = (Bw)_i. \]

Thus, $Bw$ is a constant vector. If $Bw = 0$ then 

\[ M_A^x \cdot w = M_A w + xBw = 0. \]

By Lemma 3.4, $M_A^x$ is invertible, so $Bw$ cannot equal 0. Therefore, either $S_A^x + cw$ or $S_A^x - cw$ results in 
a better payoff for player A than $S_A^x$ for $M_A^x$, contradicting the optimality of $S_A^x$. Therefore the sum of 
the entries of $w$ is not 0. 

We can then let the sum of the entries of $w$ be equal to 1. Then, since $M_A w = 0$, 

\[ w^T (\mathbf{1} - M_A^T) = (1, \ldots, 1). \]

We will return to $w$ momentarily. We can compute that $M_B^x = \mathbf{1} - M_A^T + xB$. Since $S_A^x$ is a Nash 
equilibrium strategy, 

\begin{align*}
(S_A^x)^T \cdot M_B^x &= (v, \ldots, v) \\
(S_A^x)^T \cdot (\mathbf{1} - M_A(\ell)^T) + x (S_A^x)^T B - (v, \ldots, v) &= 0 \\
(S_A^x)^T \cdot (\mathbf{1} - M_A(\ell)^T) + (d + x(\ell - 1) - v,\; d + x(\ell - 2) - v,\; \ldots,\; d - v) &= 0
\end{align*}

where $d = x (S_A^x)^T \cdot (0, 1, \ldots, \ell - 1)^T$. Then we can substitute $w$ into the equation: 

\begin{align*}
x(\ell - 1, \ldots, 1, 0) &= -(S_A^x)^T \cdot (\mathbf{1} - M_A(\ell)^T) + (v - d)(1, \ldots, 1) \\
&= -(S_A^x)^T \cdot (\mathbf{1} - M_A(\ell)^T) + (v - d) w^T (\mathbf{1} - M_A(\ell)^T) \\
&= \left( -(S_A^x)^T + (v - d) w^T \right) (\mathbf{1} - M_A(\ell)^T).
\end{align*}

Let $r_0 = -(S_A^x)^T + (v - d) w^T$ and $r_i = r_0 + x i w^T$ so that: 

\[ r_i (\mathbf{1} - M_A(\ell)^T) = x(\ell - 1 + i, \ldots, 1 + i, i). \]

Let $R$ be an $\ell \times \ell$ matrix with rows $r_0, \ldots, r_{\ell - 1}$. Then 

\[ R (\mathbf{1} - M_A(\ell)^T) = xB. \]

Therefore through this seemingly arbitrary construction we obtain that 

\[ (I_{\ell \times \ell} + R)(\mathbf{1} - M_A(\ell)^T) = \mathbf{1} - M_A(\ell)^T + xB = M_B^x, \]

which is invertible by Lemma 3.4. Thus, $\mathbf{1} - M_A(\ell)^T$ is invertible. 

□ 


The last several results have dealt with payoff matrices of restricted games. The payoff matrix of a 
restricted game is, by definition, dependent on the length of a player’s optimal strategy. The following 
lemma further demonstrates the relevance of the lengths of the players’ optimal strategies. 

Lemma 3.7. If there exist $\ell_0$ and $x_1 > x_0 \ge 0$ such that for all $x_0 < x < x_1$, $\ell(S_A(G^x_{a,b})) = \ell_0$, then 

\[ \lim_{x \to x_0} S_A(G^x_{a,b}) \]

exists and is an optimal strategy. 

Proof. Let $F_{a,b} = G^{x_0}_{a,b}$. We will treat $F_{a,b}$ as imprecise so the proof holds for both precise and imprecise 
games. By Proposition 3.6, at least one of $M_A = M_A(F_{a,b} \mid \ell_0)$ and $M_B = M_B(F_{a,b} \mid \ell_0)$ is invertible. 
Suppose $M_A$ is invertible. Then the limit 

\[ \lim_{x \to 0} \frac{(M_A^x)^{-1} \mathbf{1}}{\mathbf{1}^T (M_A^x)^{-1} \mathbf{1}} = \frac{(M_A)^{-1} \mathbf{1}}{\mathbf{1}^T (M_A)^{-1} \mathbf{1}} = S \]

exists. As $x$ goes to 0, $S_A^x(F_{a,b})$ has nonnegative entries and has entrywise sum of 1. Thus, $S$ is all nonnegative and 
also has entrywise sum of 1. Finally, 

\[ v_A = \lim_{x \to 0} v_A^x = \lim_{x \to 0} \min (M_A^x \cdot S_A^x) = \min (M_A \cdot S). \]

Thus $S$ is optimal. If $M_A$ is not invertible, then $M_B$ is invertible. By the Reverse Theorem, for all 
$x_0 < x < x_1$, $\ell(S_B(G^x_{a,b})) = \ell_0$. Therefore, we can apply the same argument as above to $S_B^x(G_{a,b})$. □ 

While the above lemma’s potential power is clear, we have not yet demonstrated that the conditions 
it requires are met by any games. We need some restrictions on the length of optimal strategies as we 
adjust chip value in order to effectively use the above results. The next lemma and its corollary give us 
the necessary structure. 

Lemma 3.8. Let player A have advantage. Let $\ell_0 = \max_{x \in \mathbb{R}_{>0}} \ell(S_A^x)$. The set of $p$ such that $\ell(S_A^p) = \ell_0$ 
is open in $\mathbb{R}_{>0}$. 

Proof. The length of $S_A^x$ is an integer and is bounded above by $a$. Hence $\ell_0$ exists. Pick some $p$ so that 
$\ell(S_A^p) = \ell_0$. Suppose there exist no $\epsilon, \delta > 0$ such that for all $p' \in N_{\epsilon,\delta}(p) = (p - \epsilon, p + \delta)$, we have 
$\ell(S_A^{p'}) = \ell_0$. Then we can define a sequence $\{x_k\} \to p$ by $x_k \in N_{1/k,1/k}(p)$ so that $\ell(S_A^{x_k}) < \ell_0$. There 
exist only a finite number of possible values for $\ell(S_A^x)$, so there must be at least one $\ell_1 < \ell_0$ such that $\{x_k\}$ 
has a convergent subsequence $\{x_{a_k}\}$ with $\ell(S_A^{x_{a_k}}) = \ell_1$ for all $k$. Passing to this subsequence, we have by Theorem 3.2 that 

\[ \lim_{k \to \infty} M_A^{x_k}(G_{a,b} \mid \ell_1) = M_A^p(G_{a,b} \mid \ell_1), \qquad \lim_{k \to \infty} v_A^{x_k}(G_{a,b} \mid \ell_1) = v_A^p(G_{a,b} \mid \ell_1). \]

Then 

\[ \lim_{k \to \infty} S_A^{x_k} = \lim_{k \to \infty} \left( (M_A^{x_k}(\ell_1))^{-1} \cdot (v_A^{x_k} \mathbf{1}) \right) = (M_A^p(\ell_1))^{-1} \cdot (v_A^p \mathbf{1}) = S, \]

for which we have that 

\[ \min (M_A^p \cdot S) = \lim_{k \to \infty} \min (M_A^{x_k} \cdot S_A^{x_k}) = \lim_{k \to \infty} v_A^{x_k} = v_A^p. \]

$S$ is then an optimal strategy in $G^p_{a,b}$. $S$ is the limit of length $\ell_1$ strategies, so it has length at most $\ell_1$. 
Therefore $S \neq S_A^p$. The player with advantage has exactly one optimal strategy, which is a contradiction, so an appropriate open 
neighborhood must exist. □ 

Lemma 3.9. Let $\ell_0$ be as above and let $M_A$ be invertible. Also assume $S_A^x$ is of constant length on some 
interval $(a, b)$. Then there exist vectors $S, T$ such that for all $x \in (a, b)$, 

\[ S_A^x = S + xT. \]

Proof. Let 

\[ S = \frac{M_A^{-1} \mathbf{1}}{\mathbf{1}^T M_A^{-1} \mathbf{1}}. \]

Note that $S$ is not necessarily optimal or even a valid strategy. It satisfies two notable properties. The 
sum of the entries of $S$ is 1 and $M_A S$ is a constant vector. Consider 

\[ (M_A^x - M_A)(S_A^x - S) = xB(S_A^x - S). \]

$xB(S_A^x - S)$ is a constant vector, as each row in $B$ differs by a vector of all 1's from the row above it, and a 
vector of all 1's multiplied by $S_A^x - S$ is 0 because both $S_A^x$ and $S$ have entrywise sum of 1. Let this constant 
vector be denoted $u$. Then 

\begin{align*}
(M_A^x - M_A)(S_A^x - S) &= u \\
M_A^x S_A^x + M_A S - M_A S_A^x - M_A^x S &= u \\
M_A^x S_A^x + M_A S - M_A S_A^x - M_A S - xBS &= u
\end{align*}

Note the $M_A S$ terms cancel, and that $M_A^x S_A^x$ is a constant vector. Thus, because $u$ is also a constant 
vector, we know that $M_A S_A^x + xBS$ is a constant vector, which we call $v$. Then 

\begin{align*}
M_A S_A^x + xBS &= v \\
S_A^x &= M_A^{-1}(v - xBS)
\end{align*}

Note that $M_A^{-1} v$ is a scalar multiple of $S$. Let this scalar be $c$. We have the relation: 

\[ S_A^x = cS - x M_A^{-1} BS. \]

We see that $c$ is a function of $x$, and must be the unique scalar that causes $cS - x M_A^{-1} BS$ to have 
entrywise sum of 1. Thus $c$ is given by: 

\begin{align*}
\sum_{i=0}^{a} (cS - x M_A^{-1} BS)_i &= 1 \\
c \sum_{i=0}^{a} S_i &= 1 + x \sum_{i=0}^{a} (M_A^{-1} BS)_i \\
c &= 1 + x \sum_{i=0}^{a} (M_A^{-1} BS)_i
\end{align*}

$\sum_{i=0}^{a} (M_A^{-1} BS)_i$ is a constant because $M_A^{-1}$, $B$, and $S$ are. Let it be denoted $r$. Then 

\[ S_A^x = (1 + rx) S - x M_A^{-1} BS = S + x (rS - M_A^{-1} BS). \]

$rS - M_A^{-1} BS$ is a vector independent of $x$. Let it be denoted by $T$. Thus, 

\[ S_A^x = S + xT. \]

Thus, on an $x$-interval on which $S_A^x$ is of constant length, $S_A^x$ is given by $S + xT$. Further, each entry of 
$S_A^x$ is given by a linear equation $S_i + xT_i$. □ 
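The linear dependence on $x$ can be checked numerically. The sketch below, in Python with exact fractions, computes $S$ and $T$ as in the proof and compares $S + xT$ with the directly normalized solution of $(M_A + xB) y = \mathbf{1}$. The $2 \times 2$ matrices `M` and `B` and all helper names are hypothetical illustrations (the only property of `B` used is that each row exceeds the previous one by 1 in every entry), and the check ignores whether the resulting vector is a valid strategy.

```python
from fractions import Fraction as F

def inv2(M):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (p, q), (r, s) = M
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def matvec(M, v):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def normalized_solution(M):
    """Return M^{-1} 1 divided by its entrywise sum."""
    y = matvec(inv2(M), [F(1), F(1)])
    total = sum(y)
    return [v / total for v in y]

# Hypothetical restricted payoff matrix and chip matrix.
M = [[F(1, 2), F(1, 4)], [F(1, 4), F(3, 4)]]
B = [[F(1), F(0)], [F(2), F(1)]]

S = normalized_solution(M)                      # the vector S from the proof
MinvBS = matvec(inv2(M), matvec(B, S))          # M^{-1} B S
r = sum(MinvBS)                                 # the constant r
T = [r * s - m for s, m in zip(S, MinvBS)]      # T = r S - M^{-1} B S

def direct(x):
    """Normalized solution for the adjusted matrix M + xB."""
    Mx = [[M[i][j] + x * B[i][j] for j in range(2)] for i in range(2)]
    return normalized_solution(Mx)

def linear(x):
    """The closed form S + xT from Lemma 3.9."""
    return [s + x * t for s, t in zip(S, T)]
```

Evaluating `direct(x)` and `linear(x)` at the same $x$ gives identical vectors for this example, and the entries of `T` sum to 0, as they must if every $S + xT$ is to have entrywise sum 1.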

Note that the above lemma does not make use of anything specific to player A or B. Thus, it also 
applies to $S_B^x$ if the necessary conditions hold. 

Corollary 3.10. Let player A have advantage. Let $\ell_0$ be as above. Then there exists $x_0 > 0$ such that 
for all $0 < x < x_0$, $\ell(S_A^x) = \ell_0$. 

Proof. Let $x \in \mathbb{R}_{>0}$ be chosen such that $\ell(S_A^x) = \ell_0$. By Proposition 3.6, at least one of $M_A$ and $M_B$ 
is invertible. Suppose first that $M_A$ is invertible. By Lemma 3.8 there exists an open interval $(a, b)$ 
containing $x$ on which $S_A^x$ is constant-length. Let $(a, b)$ be the largest such open interval. We are able 
to apply the above lemma. There exist vectors $S, T$ such that for all $x \in (a, b)$ 

\[ S_A^x = S + xT. \]

For $x > 1$, the value of a chip is greater than the value of winning the game, so neither player will ever 
bid more than 0. Thus, $b < 1$. Suppose that $a > 0$. On this interval $S_A^x$ is given by $S + xT$ for some 
$S, T$. Therefore, the $\ell_0$-th entry of $S_A^x$ is either strictly increasing, strictly decreasing, or constant. By 
Lemma 3.7 and the uniqueness of Nash equilibrium strategies for the player with advantage, 

\[ \lim_{x \to a} S_A^x = S_A^a, \qquad \lim_{x \to b} S_A^x = S_A^b. \]

If $S_A^a$ or $S_A^b$ has length $\ell_0$, then by Lemma 3.8 there is an open interval about $a$ or $b$ respectively on 
which optimal strategies have length $\ell_0$, so $(a, b)$ is not maximal. Thus, both $S_A^a$ and $S_A^b$ must have length 
less than $\ell_0$. Writing $i = \ell_0 - 1$, this implies that: 

\begin{align*}
\lim_{x \to a} (S_A^x)_{\ell_0 - 1} &= 0 = S_i + aT_i \\
\lim_{x \to b} (S_A^x)_{\ell_0 - 1} &= 0 = S_i + bT_i
\end{align*}

A linear equation has at most one zero unless $S_i$ and $T_i$ are both 0. However, $S_i + xT_i = (S_A^x)_{\ell_0 - 1} \neq 0$ on $(a, b)$, so 
this cannot be the case. Therefore $a$ must be equal to 0. Then $S_A^x$ is constant-length on some interval 
which has 0 as an endpoint. 

Now suppose that $M_B$ is invertible. We can perform the same operations on the optimal strategy 
of minimal length $S_B^x$ for player B and then apply the Reverse Theorem to achieve the same result for 
player A. □ 

Given this structure, we can complete our discussion of convergence. 

Theorem 3.11. Let player A have advantage. Then 

\[ \lim_{x \to 0} S_A(G^x_{a,b}) = S_A(G_{a,b}) \]

exists and is optimal. 

Proof. By Corollary 3.10 there exists $x_0$ such that for all $0 < x < x_0$, $\ell(S_A^x) = \ell_0$. These are the 
necessary conditions to apply Lemma 3.7, which gives the result. □ 


4 Computing the Optimal Strategy 

Although we have developed results on the structure of optimal bidding in all-pay bidding games, we 
have yet to fully describe how these optimal strategies can be found. In this section, we put together 
our results for precise games with our convergence results for imprecise games to give an algorithm to 
calculate the optimal bidding strategy for any state in an all-pay bidding game. 


4.1 Main Algorithm 

In this section, we will discuss the algorithm we developed to quickly calculate an optimal strategy. 
Our algorithm first assigns to each chip an arbitrarily small but positive value x. This adjusted game 
is precise, so we will be able to take advantage of the structure we have shown for precise games. In 
particular, we will be able to use Theorem 3.5, which gives a formula for the unique bidding strategy 
belonging to the player with advantage, in terms of the payoff matrix and optimal length: 

\[ S_A = \frac{M_A(G_{a,b} \mid \ell)^{-1} \cdot \mathbf{1}}{\mathbf{1}^T \cdot M_A(G_{a,b} \mid \ell)^{-1} \cdot \mathbf{1}} \]

From the convergence results in the previous section, the resulting strategy will be able to approximate 
an optimal strategy for player A in an imprecise game to any desired degree of accuracy. Note that this 
strategy is not guaranteed to be a unique optimal strategy in the unadjusted game if the unadjusted 
game is not precise. Once $S_A$ is known, we know by convergence that $\mathcal{R}(S_A)$ will have to be an optimal 
strategy for player B. $(S_A, \mathcal{R}(S_A))$ is then within any desired degree of accuracy of a Nash equilibrium 
for the unadjusted game. 

For now we will assume the payoff matrix is known. Then, to implement Theorem 3.5, we just need 
to invert the appropriate minor of that matrix, multiply by a vector of 1's, and rescale so that the entries 
of the resulting vector sum to 1. The problem now is to find this optimal length in a precise game where 
the payoff matrix is given. The next two lemmas will allow us to use binary search to find the optimal 
length quickly. 


Lemma 4.1. Let the game be precise. Let $\mathbf{1}_k$ be a vector of all 1's. Then for all $1 \le k \le \ell$, $M_A(k)^{-1} \cdot \mathbf{1}_k$ 
will have all nonnegative entries. 

Proof. By similar reasoning as in Lemma 3.4, we know that $M_A(k)$ will be invertible for all $k \le \ell$. We 
naturally consider the game $G_{a,b} \mid k$. Let $S_A^k$ and $S_B^k$ be A's and B's optimal strategies in this game. 

Note that if $k = \ell$, then by definition of $\ell$, we have that $M_A(\ell) \cdot S_A$ gives a constant (nonnegative) vector, 
so $M_A(\ell)^{-1} \cdot \mathbf{1}$ will be $S_A$ scaled by $1/v_A$. This will have all nonnegative entries because $S_A$ is a strategy. 

We can extend this reasoning to when $k < \ell$ if we know that $S_A^k$ still has length $k$, as it must also give 
some constant payoff, $v_A^k$, in $G_{a,b} \mid k$. 

Suppose $S_A^k$ does not have length $k$. Then $S_A^k$ has length $m < k < \ell$. Let $v_A^k$ be the value of 
$G_{a,b} \mid k$ for player A. Suppose that $v_A^k \ge v_A$. Then, we can make a strategy $S'_A$ for player A in $G_{a,b}$ by 
extending $S_A^k$ to the full game, where $(S'_A)_i = (S_A^k)_i$ if $i \le m - 1$, and is 0 otherwise. Then, note that 
$(M_A \cdot S'_A)_i \ge v_A^k$ if $i \le m - 1$. Because $m < k$, $m \le k - 1$, where $k - 1$ is the maximal number 
of chips usable in the $G_{a,b} \mid k$ game. Thus, if $i = m$, $(M_A \cdot S'_A)_m \ge v_A^k \ge v_A$ by definition of Nash 
equilibrium for $G_{a,b} \mid k$. This is A's payoff with $S'_A$ if B purely bids $m$. 

But since A's maximal bid in $S'_A$ is $m - 1$, that means if B uses a pure strategy where she bids $i > m$ 
chips, she will just be winning the same bids by more chips, which cannot be better in any way. Thus, 
$(M_A \cdot S'_A)_i \ge (M_A \cdot S'_A)_m \ge v_A^k \ge v_A$. Thus, for all $0 \le i \le \ell - 1$, $(M_A \cdot S'_A)_i \ge v_A$, so $S'_A$ is a Nash 
equilibrium strategy for $G_{a,b}$ as well. But $S'_A$ has length $m < \ell$, so it would have to be distinct from $S_A$ because 
it has a different length. This cannot be the case as A's optimal strategy is unique. Thus, we have a 
contradiction and $S_A^k$ cannot have length less than $k$. 

Thus, $S_A^k$ has length $k$, so by the same argument as the $k = \ell$ case, all the entries of $M_A(k)^{-1} \cdot \mathbf{1}$ are 
nonnegative. Note that because none of the above reasoning depended upon player A having advantage, 
if $v_A^k < v_A$, we can apply the above argument from player B's perspective. □ 

Lemma 4.2. Let the game be precise. Let $\mathbf{1}_k$ be a vector of all 1's. Then for all $k > \ell$, either $M_A(k)$ is 
not invertible or $M_A(k)^{-1} \cdot \mathbf{1}_k$ will have at least one negative entry. 

Proof. Assume $M_A(k)$ is invertible. 

We begin by showing there is no valid length-$(\ell + 1)$ strategy for player A that produces the same 
payoff for player B's first $\ell + 1$ pure strategies. Suppose there does exist such a strategy $S$. Let $v$ be 
the payoff that $S$ produces against player B's first $\ell + 1$ pure strategies (pure bids from 0 up to $\ell$). Note 
that because $S_A$ has length $\ell$ there is a Nash equilibrium strategy $\mathcal{R}(S_A)$ of length $\ell$ for player B. We 
consider three cases: 

(1) $v > v_A$ 

Since B bids at most $\ell - 1$, we only need to consider the first $\ell$ coordinates of $M_A \cdot S_A$ and $M_A \cdot S$. 
By our assumption, $v > v_A$, so $S$ is strictly better than $S_A$ against $\mathcal{R}(S_A)$. Thus $S_A$ cannot be a 
Nash equilibrium strategy, which is a contradiction. 

(2) $v < v_A$ 

If $v < v_A$ then let player B use the strategy $\mathcal{R}(S)$. It is easy to verify that $\mathcal{R}(S)$ produces the 
payoff $1 - v > 1 - v_A$ against player A's first $\ell + 1$ pure strategies. Thus, by similar reasoning as 
in the previous case, $\mathcal{R}(S)$ is strictly better than $\mathcal{R}(S_A)$ against $S_A$, so $\mathcal{R}(S_A)$ cannot be a Nash 
equilibrium strategy, which is a contradiction. 

(3) $v = v_A$ 

If $S = (s_0, \ldots, s_\ell, 0, \ldots, 0)^T$, then expanding the first $\ell + 1$ coordinates of $M_A \cdot S$ results in the 
equations $\alpha_i s_0 + \cdots + \alpha_{i+\ell} s_\ell = v_A$ for $i = 0, \ldots, -\ell$. Considering the game from player B's 
perspective, note that $\mathcal{R}(S)$ gives B a payoff of $1 - v_A$ against A's first $\ell + 1$ strategies. In 
particular, B's payoff against A bidding $\ell$ will be 

\[ (1 - \alpha_\ell) s_\ell + \cdots + (1 - \alpha_0) s_0 = 1 - (\alpha_\ell s_\ell + \cdots + \alpha_0 s_0) = 1 - v_A. \]

B's payoff $x$ against A bidding $\ell + 1$ will be 

\[ x = 1 - (\alpha_{\ell+1} s_\ell + \cdots + \alpha_1 s_0). \]

Note that because player A is winning ties, $\alpha_{\ell+1} \le \alpha_\ell, \ldots, \alpha_1 \le \alpha_0$, as in each case A is winning 
by one more chip. Thus, $\alpha_\ell s_\ell + \cdots + \alpha_0 s_0 \ge \alpha_{\ell+1} s_\ell + \cdots + \alpha_1 s_0$, which means 

\[ 1 - v_A = 1 - (\alpha_\ell s_\ell + \cdots + \alpha_0 s_0) \le 1 - (\alpha_{\ell+1} s_\ell + \cdots + \alpha_1 s_0) = x. \]

Similarly, B's payoff if A bids anything greater than $k$ will be at least $1 - v_A$. Thus, $\mathcal{R}(S)$ is a 
Nash equilibrium strategy of length $k$ for player B. Note that if $k$ is at least $\ell + 2$, then $\mathcal{R}(S)$ will 
have length at least $\ell + 2$, which will be a contradiction if $S_A$ has length $\ell$. Since $k > \ell$, this means 
we must have $k = \ell + 1$. Then, $\mathcal{R}(S)$ is a Nash equilibrium strategy of length $\ell + 1$, so by Theorem 
2.5 it must be of the form $\lambda (s_{\ell-1}, \ldots, s_0, 0, \ldots, 0) + (1 - \lambda)(0, s_{\ell-1}, \ldots, s_0, 0, \ldots, 0)$ for $0 \le \lambda \le 1$. 

In turn, $S$ must be of the form $\lambda (0, s_0, \ldots, s_{\ell-1}, 0, \ldots, 0) + (1 - \lambda)(s_0, \ldots, s_{\ell-1}, 0, 0, \ldots, 0)$. We can 
now write $M_A(\ell + 1) \cdot S = v_A \mathbf{1}_{\ell+1}$ as $M_A(\ell + 1) \cdot \left[ \lambda (0, s_0, \ldots, s_{\ell-1}) + (1 - \lambda)(s_0, \ldots, s_{\ell-1}, 0) \right] = v_A \mathbf{1}$, 
which can be expanded to the equation 

\[ \lambda \begin{pmatrix} \alpha_1 s_0 + \cdots + \alpha_\ell s_{\ell-1} \\ \alpha_0 s_0 + \cdots + \alpha_{\ell-1} s_{\ell-1} \\ \vdots \\ \alpha_{-(\ell-1)} s_0 + \cdots + \alpha_0 s_{\ell-1} \end{pmatrix} + (1 - \lambda) \begin{pmatrix} \alpha_0 s_0 + \cdots + \alpha_{\ell-1} s_{\ell-1} \\ \vdots \\ \alpha_{-(\ell-1)} s_0 + \cdots + \alpha_0 s_{\ell-1} \\ \alpha_{-\ell} s_0 + \cdots + \alpha_{-1} s_{\ell-1} \end{pmatrix} = \begin{pmatrix} v_A \\ v_A \\ \vdots \\ v_A \end{pmatrix}. \]

By considering the first coordinate, we get the equation 

\[ \lambda (\alpha_1 s_0 + \cdots + \alpha_\ell s_{\ell-1}) + (1 - \lambda) v_A = v_A \]

so we must have $\alpha_1 s_0 + \cdots + \alpha_\ell s_{\ell-1} = v_A$ as well. Therefore, 

\[ \alpha_1 s_0 + \cdots + \alpha_\ell s_{\ell-1} = \alpha_0 s_0 + \cdots + \alpha_{\ell-1} s_{\ell-1}. \]

But since the game is precise, there must be a strict inequality for all the coefficients: $\alpha_1 < \alpha_0, \ldots, \alpha_\ell < 
\alpha_{\ell-1}$, so 

\[ \alpha_1 s_0 + \cdots + \alpha_\ell s_{\ell-1} < \alpha_0 s_0 + \cdots + \alpha_{\ell-1} s_{\ell-1} \]

because not all the $s_i$ are 0. Thus, we have a contradiction, and $k$ cannot be $\ell + 1$ either. 

Thus, no such strategy $S$ can exist, so if $M_A(k)$ is invertible, $M_A(k)^{-1} \cdot \mathbf{1}_k$ must have some negative 
terms. □ 

We can implement the binary search algorithm as follows. Let the lower bound, low, start as 1. Let 
the upper bound, high, start as min(a, b) + 1. 

    Function LSearch(M_A, low, high)
        if low + 1 = high then
            return low
        else
            k = floor((low + high) / 2)
            if M_A(k)^{-1} · 1_k is all nonnegative then
                return LSearch(M_A, k, high)
            else
                // M_A(k) is not invertible or M_A(k)^{-1} · 1_k has a negative entry
                return LSearch(M_A, low, k)
            end
        end

Algorithm 1: Binary Search For Length 

By Lemmas 4.1 and 4.2, this algorithm will return the length of the optimal strategy for player A. 
We can then apply our formula to directly compute player A’s unique optimal strategy. The reverse of 
this strategy is an optimal strategy for player B. This completes the algorithm. From our results on the 
convergence of strategies, this algorithm is also able to approximate, with any desired degree of accuracy, 
optimal strategies for imprecise games. 
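The length search and the formula of Theorem 3.5 can be combined in a short Python sketch. It is an illustration only: the function names are hypothetical, the solver is generic Gauss-Jordan elimination over exact fractions (not the Toeplitz-specific solver discussed in Section 4.3), the payoff matrix is assumed square, and the $3 \times 3$ matrix is an invented example rather than the payoff matrix of an actual game.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve M y = rhs by Gauss-Jordan elimination; return None if M is singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def minor(M, k):
    """Top-left k x k minor of M."""
    return [row[:k] for row in M[:k]]

def is_feasible(M, k):
    """True if M(k) is invertible and M(k)^{-1} 1_k has no negative entry."""
    y = solve(minor(M, k), [1] * k)
    return y is not None and all(v >= 0 for v in y)

def length_search(M, low, high):
    """Iterative form of LSearch: invariant is low <= length < high."""
    while low + 1 < high:
        k = (low + high) // 2
        if is_feasible(M, k):
            low = k
        else:
            high = k
    return low

def optimal_strategy(M):
    """Strategy of the player with advantage, padded with zeros to full size."""
    n = len(M)
    length = length_search(M, 1, n + 1)
    y = solve(minor(M, length), [1] * length)
    total = sum(y)
    return [v / total for v in y] + [Fraction(0)] * (n - length)

# Hypothetical payoff matrix whose 3 x 3 system has negative solution entries.
M = [[Fraction(1, 2), Fraction(1, 4), 1],
     [Fraction(1, 4), Fraction(3, 4), 1],
     [0, 0, Fraction(1, 2)]]
strategy = optimal_strategy(M)
```

The loop replaces the recursion of Algorithm 1 but visits the same values of $k$. Like the pseudocode, it never tests $k = \text{low}$ itself, relying on Lemma 4.1 for the smallest length.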

4.2 Recursion on Directed Graphs 

So far, our results apply to the strategy for bidding on a single turn in an all-pay bidding game. This 
assumes some prior knowledge of successor game states that allows the payoff matrix to be already 
known. Thus, to use our algorithm to compute Nash equilibria for any all-pay bidding game state, we 
need some way of first finding the payoff matrix. By noting that the payoff matrices for end states (where 
one player has already won) can be set as 0 and 1 for win and loss, we use recursion from the end states 
of the game to find the payoff matrix for an arbitrary turn. 

Consider a combinatorial game $G$ represented as a directed graph $D = (V, E)$ with two vertices 
marked as A and B and a token placed at some vertex of the graph. We can think of each vertex as the 
starting position of a subgame of $G$. Thus for player A with $a$ chips and player B with $b$ chips, the token 
on vertex $w$, we write the game as $w_{a,b}$. Let $S(w)$ give all the vertices that can be moved to from $w$. 
We can compute $v_A$ as follows: 

\[ v_A(w_{a,b}) = \begin{cases} 1 & \text{if } w = A \\ 0 & \text{if } w = B \\ S_B^T \cdot X \cdot S_A & \text{otherwise} \end{cases} \]

Here $X$ is the payoff matrix at $w_{a,b}$, with rows indexed by B's bid $i$ and columns by A's bid $j$. If B bids $i$ and A bids $j$ and A makes a move, then A's payoff will be 

\[ A(i,j) = \max_{w' \in S_A(w)} v_A(w'_{a-j+i,\, b-i+j}) \]

because A will seek to maximize his probability of winning over all of his possible successor states. If B 
bids $i$ and A bids $j$ and B makes a move, then A's payoff will be 

\[ B(i,j) = \min_{w' \in S_B(w)} v_A(w'_{a-j+i,\, b-i+j}) \]

because B will seek to minimize A's probability of winning over all of her possible successor states. 
Therefore 

\[ X_{i,j} = \begin{cases} \max(A(i,j), B(i,j)) & \text{if } i < j \text{, or } i = j \text{ and A has advantage} \\ \min(A(i,j), B(i,j)) & \text{if } i > j \text{, or } i = j \text{ and B has advantage} \end{cases} \]

as each player will consider the best possible scenario if they move and the worst possible scenario if 
their opponent moves. Then $S_A, S_B$ can be computed from this payoff matrix $X$, using our algorithm from 
before. 

Note this allows us to recurse up the directed graph from states A and B, first with values for those 
states, then values for the states one move away (i.e. $v$ such that either A or B $\in S(v)$), then states two 
moves away, and so on. 
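Assembling the payoff matrix for one vertex from already-known successor values might look like the following Python sketch. Several things here are assumptions made for illustration: the function name `payoff_matrix`, the `value` callback standing in for previously computed (memoized) values of $v_A$, the convention that rows are B's bids and columns are A's bids, and the reading of "advantage" as winning tied bids.

```python
def payoff_matrix(a, b, succ_a, succ_b, value, a_wins_ties=True):
    """Build the (b+1) x (a+1) payoff matrix X for a vertex.

    succ_a / succ_b: successor vertices available if A / B makes the move.
    value(w, chips_a, chips_b): A's value at successor w with the given chips
    (hypothetical callback, e.g. a lookup into a memo table).
    """
    X = []
    for i in range(b + 1):              # i: B's bid (row index)
        row = []
        for j in range(a + 1):          # j: A's bid (column index)
            chips_a, chips_b = a - j + i, b - i + j   # bids are exchanged
            if_a_moves = max(value(w, chips_a, chips_b) for w in succ_a)
            if_b_moves = min(value(w, chips_a, chips_b) for w in succ_b)
            a_wins_bid = j > i or (j == i and a_wins_ties)
            if a_wins_bid:
                row.append(max(if_a_moves, if_b_moves))
            else:
                row.append(min(if_a_moves, if_b_moves))
        X.append(row)
    return X

def terminal_value(w, chips_a, chips_b):
    """Value at an end state: 1 if the token is on A, 0 if it is on B."""
    return 1 if w == "A" else 0

# Toy vertex one move from both end states: A can move to "A", B can move to "B".
X = payoff_matrix(1, 1, ["A"], ["B"], terminal_value)
```

In this toy position whoever wins the bid wins the game, so `X` has a 1 wherever A's bid is at least B's bid and a 0 elsewhere. In a full implementation `value` would recurse to deeper vertices and cache its results.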


4.3 Complexity 

An arbitrary $n \times n$ matrix can be inverted in $O(n^3)$ time using the Gauss-Jordan method. There exist, 
however, many more efficient algorithms specific to Toeplitz matrices. In particular, the Levinson-Trench-Zohar 
algorithm can solve a Toeplitz system in $O(n^2)$ time [3]. 

For an $n \times n$ matrix, the binary search algorithm requires $\log(n)$ iterations. Each iteration requires 
solving one Toeplitz system and scanning one vector for negative values. Thus, the algorithm runs in 
time on the order of $\log(n) \cdot (O(n^2) + O(n)) = O(\log(n) n^2)$. Thus finding an optimal strategy and 
corresponding payoff for a given payoff matrix requires time on the order of $O(\log(n) n^2)$. 

A simple implementation of our recursive algorithm would take time growing exponentially with the 
depth of $D$. We can greatly speed up this process by storing each $v_A(i,j)$ that is computed. Then when 
$v_A(i,j)$ must be computed again, the value can be looked up rather than recomputed. In the worst case, 
the program must compute $v_A$ for every possible combination of chips at every vertex. Because the sum 
of the chips is constant, this requires at most $(a + b) \cdot |V|$ computations. Thus, the entire algorithm 
runs in time on the order of $O(|V| \cdot \log(n) n^2)$ where $n = a + b$. For comparison, a linear programming 
algorithm to achieve the same results would require time on the order of $O(|V| \cdot n^{3.5})$ [3]. 